Abstract

The pair production of heavy fourth-generation quarks, which are predicted under the hypothesis of 
flavor democracy, is studied using tree-level Monte Carlo generators and fast detector simulation. Two 
heavy-quark mass values, 500 and 750 GeV, are considered with the assumption that the fourth family 
mixes primarily with the two light families. It is shown that a clear signature will be observed in the 
data collected by the ATLAS detector, after the first year of low-luminosity running at the Large Hadron 
Collider. 

1 Introduction 

It is well known that the number of fundamental fermion families (generations) is not fixed by the Standard
Model (SM). The precision measurements performed by the Large Electron-Positron Collider experiments
at the Z pole have shown that the number of families with light neutrinos ($m_\nu < m_Z/2$) is equal to three. On
the other hand, the asymptotic freedom in QCD constrains this number to be less than nine. Therefore, from
a purely experimental point of view, it is meaningful to search for a possible fourth SM family at the forthcoming
colliders. On the theoretical side, the fourth SM family is a direct outcome of the flavor democracy (or,
in other words, democratic mass matrix) approach [1] [2], which is strongly motivated by naturalness
arguments (see the review [4] and the references therein). Meanwhile, there are phenomenological arguments
against the existence of a fifth SM family [5]. In this paper, the additional quark and lepton pairs of
the fourth family are denoted as $u_4$, $d_4$ and $e_4$, $\nu_4$.

The most recent limit on the mass of the $u_4$ quark is $m_{u_4} > 256$ GeV [6]. Partial-wave unitarity gives
an upper bound of about 1 TeV on the fourth-family fermion masses [7]. According to flavor democracy,
the masses of the new quarks have to be within a few GeV of each other. This is also hinted at experimentally
by the value of the $\rho$ parameter, which is close to unity [8]. Therefore, if the fourth SM family exists, the
Large Hadron Collider (LHC) will copiously produce its quarks [9] and the proposed linear colliders will
provide the opportunity to discover its leptons [10]. As the single production of the new quarks at the LHC is
suppressed relative to their pair production, due to the small value of the CKM matrix elements, the
latter is considered. The new quarks, being heavy, will decay to the known SM quarks and W bosons. The
dominant decay channels are defined by the $4\times 4$ extension of the CKM mixing matrix, with two distinct
possibilities:

1) If the fourth family is primarily mixing with the third one, the decay channels will be $u_4 \to W^+ b$ and
$d_4 \to W^- t$. The signature of the $u_4\bar{u}_4$ production will be $W^+W^-b\bar{b}$, whereas in the case of $d_4\bar{d}_4$, the final
state would have an additional $W^+W^-$ pair. The former case has been studied in [9] [11] about 10 years
ago¹. The latter case, while potentially feasible owing to the low predicted SM backgrounds with four W
bosons in the final state, is likely to be less interesting as a discovery channel, due to the difficulties in the
jet association and invariant mass reconstruction.

¹ Recently this process has been reconsidered in [12] as "the best scenario for the LHC".

2) If the fourth generation is primarily mixing with the first two families, the dominant decay channels
will be $u_4 \to W^+ d/s$ and $d_4 \to W^- u/c$. In this case, since the light-quark jets are indistinguishable, the
signature will be $W^+W^-jj$ for both $u_4\bar{u}_4$ and $d_4\bar{d}_4$ pair production. Therefore, both up- and down-type new
quarks should be considered together, since distinguishing between $u_4$ and $d_4$ quarks with quasi-degenerate
masses at a hadron collider seems to be a difficult task. In this sense, lepton colliders are more advantageous,
especially if the fourth-family quarkonia could be formed.

Results of the most up-to-date measurements of the quark mixings, as published by the Particle Data
Group [8], together with the unitarity assumption for the $4\times 4$ extension of the CKM matrix, can be used to
constrain the mixings related to the fourth-family quarks. The first step is to calculate the squares of the entries in
the fourth row and column together with their errors:
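
$$
|V_{i4}|^2 = 1 - \sum_{j=1}^{3} |V_{ij}|^2 \pm 2\sqrt{\sum_{j=1}^{3} \left(V_{ij}\,\delta_{ij}\right)^2}, \qquad
|V_{4j}|^2 = 1 - \sum_{i=1}^{3} |V_{ij}|^2 \pm 2\sqrt{\sum_{i=1}^{3} \left(V_{ij}\,\delta_{ij}\right)^2} \qquad (1)
$$

where $V_{ij}$ are the CKM matrix elements and the $\delta_{ij}$ are the quoted errors on these measurements. If one
allows the $|V_{i4}|^2$ and $|V_{4j}|^2$ to deviate by one sigma, the square root of the sum gives the upper limit for the
fourth-family quark mixings: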



$$
\mathrm{CKM}_{4\times 4} =
\begin{pmatrix}
0.97377 \pm 0.00027 & 0.2257 \pm 0.0021 & 0.00431 \pm 0.00030 & < 0.044 \\
0.230 \pm 0.011 & 0.957 \pm 0.095 & 0.0416 \pm 0.0006 & < 0.46 \\
0.0074 \pm 0.0008 & 0.0406 \pm 0.0027 & > 0.78 & < 0.47 \\
< 0.063 & < 0.46 & < 0.47 & > 0.57
\end{pmatrix} \qquad (2)
$$

where the lower (upper) limit of (1) is implicitly assumed for all the new entries [13].
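
The upper limits for the mixings with the two light families can be reproduced numerically. The following Python sketch assumes that unitarity fixes each squared fourth-family element as one minus the sum of the three measured squares in the same row or column, that the error on this quantity follows from first-order propagation, $2\sqrt{\sum (V_{ij}\delta_{ij})^2}$, and that the limit is the square root of the central value plus one such error. This form is an assumption inferred from the quoted numbers, and the names used are illustrative.

```python
import math

# Measured CKM magnitudes and errors quoted in the matrix above
# (only the entries given as value +/- error are used).
V = {
    ("u", "d"): (0.97377, 0.00027), ("u", "s"): (0.2257, 0.0021), ("u", "b"): (0.00431, 0.00030),
    ("c", "d"): (0.230, 0.011),     ("c", "s"): (0.957, 0.095),   ("c", "b"): (0.0416, 0.0006),
    ("t", "d"): (0.0074, 0.0008),   ("t", "s"): (0.0406, 0.0027),
}

def upper_limit(entries):
    """One-sigma upper limit on the missing fourth element (assumed formula)."""
    central = 1.0 - sum(v * v for v, _ in entries)
    error = 2.0 * math.sqrt(sum((v * dv) ** 2 for v, dv in entries))
    return math.sqrt(central + error)

limits = {
    "row 1 (u)": upper_limit([V["u", q] for q in "dsb"]),
    "row 2 (c)": upper_limit([V["c", q] for q in "dsb"]),
    "column 1 (d)": upper_limit([V[q, "d"] for q in "uct"]),
    "column 2 (s)": upper_limit([V[q, "s"] for q in "uct"]),
}

for name, value in limits.items():
    print(f"{name}: fourth-family mixing < {value:.3f}")
```

Under these assumptions the four values come out close to the quoted 0.044, 0.46, 0.063 and 0.46. The third row and column depend on the lower limit on the top-bottom element and are not covered by this sketch.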

The remainder of this paper investigates the discovery potential of the ATLAS experiment at the LHC
for the fourth-family quarks in the case where their dominant mixings are to the first and second SM
families, as described in the second scenario above. The tree-level diagrams for the pair production of the
new quarks and their subsequent decays are given in Fig. 1 for the $d_4$ quark decaying via $d_4 \to W q$ ($q = u, c$).
The same diagrams are also valid for the $u_4$ quark production and decay, provided the $c$ and $u$ quarks
are replaced by $s$ and $d$ quarks. The widths of the $d_4$ and $u_4$ quarks are proportional to $|V_{d_4 u}|^2 + |V_{d_4 c}|^2$
and $|V_{u_4 d}|^2 + |V_{u_4 s}|^2$, respectively. Although the extension parameters have much higher upper limits, for
the event generation and analysis sections, the common and conservative value of 0.01 is used for all four
relevant mixings. As the widths of the new quarks are much smaller than their masses, this choice of the
new CKM elements has no impact on the pair production cross sections.
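
To illustrate how small the widths are compared with the masses, the sketch below evaluates the standard tree-level width for a heavy quark decaying to a W boson and a massless light quark, $\Gamma = G_F m^3 |V|^2 / (8\pi\sqrt{2}) \cdot (1 - x)^2 (1 + 2x)$ with $x = m_W^2/m^2$, summed over the two light-family channels with a common mixing of 0.01. The numerical constants and the neglect of light-quark masses are assumptions of this sketch, not values taken from the text.

```python
import math

G_F = 1.1663787e-5   # Fermi constant in GeV^-2 (assumed input)
M_W = 80.379         # W boson mass in GeV (assumed input)

def partial_width(m_q, v_ckm):
    """Tree-level width (GeV) for Q -> W q with a massless final-state quark."""
    x = (M_W / m_q) ** 2
    prefactor = G_F * m_q ** 3 / (8.0 * math.pi * math.sqrt(2.0))
    return prefactor * v_ckm ** 2 * (1.0 - x) ** 2 * (1.0 + 2.0 * x)

def total_width(m_q, v_ckm=0.01, n_channels=2):
    """Sum over the two light-family decay channels with a common mixing."""
    return n_channels * partial_width(m_q, v_ckm)

for mass in (500.0, 750.0):
    width = total_width(mass)
    print(f"m = {mass:.0f} GeV: width = {width:.2e} GeV, width/m = {width / mass:.1e}")
```

With these inputs the widths are of order $10^{-2}$ GeV or below, more than four orders of magnitude smaller than the masses.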



2 Event Generation 

In order to study the possibility of discovery, the four-family model has been implemented in the tree-level
generator CompHEP v4.4.3 [14], and the pair production of the new quarks at the LHC and their
subsequent decay into SM particles have been simulated. The QCD scale is set to the mass of the new
quark under study and the CTEQ6L1 set is chosen for the parton distribution functions [15]. Table 1 gives the
cross section for $d_4\bar{d}_4$ production for the example values of the $d_4$ quark mass, together with the decay
widths. As the cross section for $u_4\bar{u}_4$ production is within 1% of the $d_4\bar{d}_4$ one, from this point on only
$d_4$ will be considered and the results will be multiplied by two to cover all signal processes involving both
$u_4$ and $d_4$ quarks. For each of the considered mass values, 12 thousand signal events have been generated

Figure 1: The tree-level Feynman diagrams for the pair production and decay of the $d_4$ quarks at the LHC.

Table 1: The quark-mass values considered and the associated widths and pair production cross sections at
the LHC.

$m_{d_4}$ (GeV)     500                     750
$\Gamma$ (GeV)      $8.23\times 10^{-3}$    $2.79\times 10^{-2}$
$\sigma$ (pb)       2.63                    0.250

for the $d_4\bar{d}_4 \to W^-W^+jj$ process, where $j$ is a jet originating from a quark or antiquark of the first two
SM families. To benefit from the possible combined lepton and jet triggers and to reduce the ambiguity in
the invariant mass reconstruction, the hadronic decay of one W boson and the leptonic (electron or muon)
decay of the other have been considered. Therefore, the signal is searched for in the $4j + \ell + \not{E}_T$ final
state, where $\ell$ is an electron or a muon.

The background events originate from all the SM processes whose final state has at least two W bosons
and two non-b-tagged jets. The direct background is from SM events which yield exactly the same final-state
particles as the signal events. The contributions from same-sign W bosons are insignificant. Some of
the indirect backgrounds are also taken into account. The dominant contribution is from $t\bar{t}$ pair production,
where the $b$ jets from the decay of the top quarks could be mistagged as light jets. Similarly, jet-associated
top-quark pair production ($t\bar{t}j \to W^-W^+b\bar{b}j$) contributes substantially to the SM background, as its
production cross section is comparable to that of the pair production and only one mistagged $b$-jet is sufficient to
fake the signal events. The cross section for the next-order process, namely $pp \to t\bar{t}\,2j$, has been computed
to be four times smaller than that of the $t\bar{t}j$ process and therefore this process has not been considered. It should be
noted that the $t\bar{t}$ and $t\bar{t}j$ samples have been conservatively added together, in spite of the fact that the
initial- and final-state parton showers simulated in Pythia for the former would account for part of the cross section
for the latter. Finally, the background from SM processes with a $W^\pm Z qq$ ($q = u, d, s, c$) final state has been
studied. Its contribution to the total background is very similar to that of the direct ($WWjj$) background. All the
mentioned background processes have been generated with MadGraph v3.95 [16]. This tree-level generator
was previously shown to give results in good agreement with CompHEP and to be more suitable for running
on a computer farm [17]. A total of more than 280 thousand events, generated at different QCD scales and
with different jet selection criteria, comprise the background sample.

The events from both generators are fed into the ATLAS detector simulation and event reconstruction
framework, ATHENA v11.0.41, with the CompHEP events using the interface program CPYTH v2.0.1
[18]. Parton showering, hadronization and fragmentation are simulated using the ATHENA interface of
Pythia v6.23 [19], and the detector response is obtained from the fast simulation software ATLFast [20].
This software uses a parameterized function to calculate the final particle kinematic variables rapidly, and
its output is calibrated to match the results from GEANT-based full detector simulation [21]. The physics
objects from ATLFast are used in the final analysis in ROOT 5.12 [22].

Figure 2: Distributions of the kinematical observables for the signal events (blue line) as compared to the
backgrounds (red data points): (a) transverse momentum of the highest-$p_T$ charged lepton, (b) missing
transverse momentum, (c) invariant mass of the hadronic W candidates reconstructed from two jets, and
(d) transverse momentum of the two hardest jets. The distributions in plots a, b, c and d are shown after
the application of the event selection requirements up to criteria number 1, 4, 5 and 6 in Table 2,
respectively. Signal and background histograms have been scaled to the same luminosity, except in plot c,
where the histograms have been normalized to unit area.



3 Event Selection and Reconstruction 

The first step of the event selection is the requirement of a single isolated lepton with transverse momentum
$p_T^{\ell} > 15$ GeV, and at least four jets with transverse momenta $p_T^{j} > 20$ GeV. The transverse momentum
of the highest-momentum isolated muon in each event is shown in Fig. 2a. The four highest-energy jets are
required not to be $b$-tagged, as determined by ATLFASTB [20], a fast $b$-tagging simulation program which
utilizes a $p_T$-dependent parameterization of the tagging efficiencies. For instance, at high momenta
($p_T^{j} > 100$ GeV) the tagging efficiencies for $b$, $c$ and light jets are 50%, 7.6% and 0.6%, respectively.
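
The preselection described above can be written as a simple filter. The sketch below is illustrative only: the event layout (dictionaries with `pt`, `energy`, `isolated` and `btag` keys) is a hypothetical stand-in and not the ATLFast output format, and "single isolated lepton" is interpreted as exactly one such lepton above threshold.

```python
def passes_preselection(event, lep_pt_min=15.0, jet_pt_min=20.0, n_jets_min=4):
    """Apply the lepton, jet-multiplicity and b-tag veto requirements.

    `event` is a hypothetical dict: {"leptons": [...], "jets": [...]},
    where each lepton has "pt" and "isolated", and each jet has
    "pt", "energy" and "btag". Momenta and energies are in GeV.
    """
    # Exactly one isolated electron or muon above threshold.
    leptons = [l for l in event["leptons"]
               if l["isolated"] and l["pt"] > lep_pt_min]
    if len(leptons) != 1:
        return False

    # At least four jets above threshold.
    jets = [j for j in event["jets"] if j["pt"] > jet_pt_min]
    if len(jets) < n_jets_min:
        return False

    # None of the four highest-energy jets may be b-tagged.
    leading = sorted(jets, key=lambda j: j["energy"], reverse=True)[:4]
    return not any(j["btag"] for j in leading)
```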

The leptonically decaying W boson is reconstructed by attributing the total missing transverse momentum
in the event, shown in Fig. 2b, to the lost neutrino, and using the nominal mass of the W as a constraint.
The two-fold ambiguity in the longitudinal direction of the neutrino is resolved by choosing the solution
with the lower neutrino energy. The four-momenta of the third and fourth most energetic jets in the event
are combined to reconstruct the hadronically decaying W boson. Due to the high momentum of the W
boson in the signal events, particularly for the high values of the quark mass, the two jets are not always resolved
in the detector. When this happens, one of the two jets used in the combination is a random jet, which
spuriously increases the invariant mass, $m_{jj}$, of the reconstructed W. Such cases cause a long high-end tail
in the invariant mass distribution for the signal, as shown in Fig. 2c. In order to reduce their adverse effect
on the final $m_{q_4}$ distributions, events with $m_{jj} > 200$ GeV are rejected, even though the comparison of the
distributions for the signal and the background would suggest that a looser criterion would benefit the final
statistical significance.
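
The W-mass constraint leads to a quadratic equation for the neutrino longitudinal momentum. A minimal sketch is given below, assuming a massless lepton and neutrino and a nominal W mass of 80.4 GeV (an assumed value); the function names are illustrative. Events with a negative discriminant have no real solution, which presumably corresponds to the "possible neutrino solution" requirement of the selection.

```python
import math

M_W = 80.4  # nominal W mass in GeV (assumed value)

def neutrino_pz_solutions(lep, met_x, met_y, m_w=M_W):
    """Solve (p_lep + p_nu)^2 = m_w^2 for the neutrino p_z.

    `lep` is the lepton momentum (px, py, pz) in GeV; both particles
    are treated as massless. Returns a tuple of two solutions, or
    None if the discriminant is negative.
    """
    lx, ly, lz = lep
    e_lep = math.sqrt(lx * lx + ly * ly + lz * lz)
    pt_lep2 = lx * lx + ly * ly
    pt_nu2 = met_x * met_x + met_y * met_y
    mu = 0.5 * m_w * m_w + lx * met_x + ly * met_y
    disc = mu * mu - pt_lep2 * pt_nu2
    if disc < 0.0:
        return None
    root = e_lep * math.sqrt(disc)
    return ((mu * lz - root) / pt_lep2, (mu * lz + root) / pt_lep2)

def lower_energy_solution(solutions):
    """Pick the solution with the lower neutrino energy.

    For fixed transverse momentum this is the one with the smaller |p_z|.
    """
    return min(solutions, key=abs)

def invariant_mass(lep, nu):
    """Invariant mass of two massless particles given as (px, py, pz)."""
    e = math.sqrt(sum(c * c for c in lep)) + math.sqrt(sum(c * c for c in nu))
    p = [a + b for a, b in zip(lep, nu)]
    return math.sqrt(max(e * e - sum(c * c for c in p), 0.0))
```

Both roots satisfy the mass constraint exactly, so the choice between them is a convention; the lower-energy choice follows the text.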

The surviving events are used to obtain the invariant mass of the new quark. Each reconstructed W is 

Table 2: Efficiencies of the selection criteria, as applied in the order listed, for the $m_{d_4}$ = 500 (750) GeV signal
and the largest component of the SM background ($t\bar{t}j$).

#   Criterion                                ε-Signal (%)    ε-Background (%)
1   Single e/μ, $p_T^{\ell}$ > 15 GeV        32 (32)         29
2   At least 4 jets, $p_T^{j}$ > 20 GeV      86 (84)         84
3   b-tagging veto                           92 (90)         33
4   Possible neutrino solution               75 (71)         76
5   $m_{jj}$ < 200 GeV                       50 (44)         75
6   2 hardest jets, $p_T^{j}$ > 100 GeV      94 (98)         35
7   $|\Delta m_{q_4}|$ < 100 GeV             56 (49)         50
    Total efficiency, ε_all                  5.0 (3.6)       0.8



Table 3: The expected number of signal and background events and the signal significance for the two
masses under consideration.

                    500 GeV        750 GeV
Luminosity          1 fb$^{-1}$    10 fb$^{-1}$
Signal              192            134
Background          244            226
$S/\sqrt{S+B}$      9.2            7.1



associated with one of the two hardest jets, for which the minimum transverse momentum requirement is
tightened to $p_T^{j} > 100$ GeV. As observed in Fig. 2d, this tighter requirement has no significant effect on
the signal, while substantially reducing the background. A tighter $p_T^{j}$ selection would start to skew the
final invariant mass distributions. Therefore the lower value of 100 GeV was chosen so that the analysis
results could be safely interpreted for lower masses as well. The W-jet association ambiguity is resolved
by selecting the combination which results in the smallest difference between the masses of the two reconstructed
$q_4$ quarks in the same event. If this mass difference is more than 100 GeV for either combination,
the event is rejected. The summary of the event selection cuts and their efficiencies for both signal and
background events is listed in Table 2 for a quark mass of 500 GeV. These selection criteria were not
optimized for the $m_{q_4} = 750$ GeV case, so as to be safely pessimistic. The results of the reconstruction for quark
masses of 500 GeV and 750 GeV are shown in Fig. 3, together with the various backgrounds, for integrated
luminosities of 1 and 10 fb$^{-1}$, respectively. The bulk of the background in both cases is due to $t\bar{t}j$ events, as
discussed before.
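
The W-jet association step can be sketched as follows. Four-vectors are plain (E, px, py, pz) tuples in GeV and the function and argument names are hypothetical. The sketch interprets the rejection condition as applying to the selected (smallest-difference) combination, with the 100 GeV threshold quoted in the text.

```python
import math

def add(p, q):
    """Sum two four-vectors given as (E, px, py, pz)."""
    return tuple(a + b for a, b in zip(p, q))

def mass(p):
    """Invariant mass of a four-vector (negative m^2 is clipped to zero)."""
    e, px, py, pz = p
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))

def pair_w_with_jets(w_lep, w_had, jet1, jet2, max_delta_m=100.0):
    """Choose the W-jet pairing with the smallest mass difference.

    Returns the two reconstructed quark masses, or None if the best
    pairing still differs by more than `max_delta_m` (event rejected).
    """
    candidates = [
        (mass(add(w_lep, jet1)), mass(add(w_had, jet2))),
        (mass(add(w_lep, jet2)), mass(add(w_had, jet1))),
    ]
    best = min(candidates, key=lambda m: abs(m[0] - m[1]))
    if abs(best[0] - best[1]) > max_delta_m:
        return None
    return best
```

Each accepted event then contributes both returned masses to the invariant mass histogram.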

4 Results 

In order to extract the signal significance, an analytical function consisting of a Crystal Ball term [23] to
represent the background and a Breit-Wigner term to represent the signal resonance is fitted to the total
number of $q_4$ candidates in the invariant mass plots of Fig. 3. In both plots, the fitted function is shown in
solid black, and its signal component is plotted as a dashed red line. The shape of the background curve was
also verified against random fluctuations (as in Fig. 3, left side, in the 500-600 GeV region of the $WWbbj$
curve) by parameterizing the background and then generating a large sample of pseudo-MC experiments.
It was found that with large statistics the Crystal Ball function is a very accurate description of the background
shape. The extracted number of total signal events is in very good agreement with the actual number of
events in the signal Monte Carlo sample. The significance is estimated as $S/\sqrt{S+B}$, where $S$ ($B$) is the
number of signal (background) events determined from the Breit-Wigner (Crystal Ball) term of the fitted
function. As each event contributes two $q_4$ candidates to the invariant mass histogram, the total number of
signal (background) events is obtained by taking half of the integral of the signal (background) term within
$\pm 2\Gamma$ (twice the full width at half maximum) of the peak position of the signal. For the case of $m_{d_4}$ = 500 GeV
(750 GeV), with 1 fb$^{-1}$ (10 fb$^{-1}$) of data, the signal significance is found to be 9.2 (7.1). The numbers of
events for these two example cases, for both signal and background, are presented in Table 3.
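
The quoted total efficiencies and significances can be cross-checked with simple arithmetic: the total efficiency is the product of the per-criterion efficiencies of Table 2, and the significance follows from the event counts of Table 3. The sketch below uses only numbers quoted in this paper; the variable and function names are illustrative.

```python
import math

# Per-criterion efficiencies (in %) from Table 2, in the order applied.
EFFICIENCIES = {
    "signal_500": [32, 86, 92, 75, 50, 94, 56],
    "signal_750": [32, 84, 90, 71, 44, 98, 49],
    "ttj_background": [29, 84, 33, 76, 75, 35, 50],
}

# (signal, background) event counts from Table 3.
COUNTS = {"500 GeV, 1 fb^-1": (192, 244), "750 GeV, 10 fb^-1": (134, 226)}

def total_efficiency(percentages):
    """Product of sequential selection efficiencies, returned in percent."""
    return 100.0 * math.prod(p / 100.0 for p in percentages)

def significance(s, b):
    """Simple counting significance S / sqrt(S + B)."""
    return s / math.sqrt(s + b)

for name, effs in EFFICIENCIES.items():
    print(f"{name}: total efficiency = {total_efficiency(effs):.1f}%")
for name, (s, b) in COUNTS.items():
    print(f"{name}: S/sqrt(S+B) = {significance(s, b):.1f}")
```

After rounding, these reproduce the total efficiencies of 5.0 (3.6)% and 0.8% and the significances of 9.2 and 7.1.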

Figure 3: Invariant mass distributions for the reconstructed $q_4$ candidates from signal and SM background
events for a quark of mass 500 GeV (left) and 750 GeV (right). The histograms are populated by two $q_4$
candidates per event. The colored solid lines show the backgrounds from various processes; the solid black
curve represents the fit to the sum of the background and signal events. Also shown as a red dashed curve is
the signal component of the fit.



5 Conclusion 

The analysis can be extrapolated to other quark mass values to estimate the amount of integrated luminosity
necessary for a discovery. Fig. 4 contains the fourth-generation quark ($u_4$ and $d_4$ combined)
pair production tree-level cross section, showing the contributions from gluon fusion and $q\bar{q}$ annihilation.
For the selected parton distribution function, the latter becomes more important at a quark mass of around
650 GeV. The same figure, on the right-hand side, shows the estimated integrated luminosity required for a
$5\sigma$ discovery as a function of the mass of the new quark. The estimates on this plot are based on the cross
sections shown and the integration of the background function as obtained from the fits presented in the
analysis section. In all cases, the number of signal events to be collected in order to reach the $5\sigma$ significance
is above 20. While this study is based on a fast simulation of the detector response, which was not
fully validated, and there are uncertainties associated with the QCD scale, statistical errors, etc., we believe
that the conservative selection cuts and the simplicity of the reconstruction algorithms lend reliability to the
conclusions.
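
Because both S and B grow linearly with integrated luminosity, $S/\sqrt{S+B}$ grows as its square root, so the luminosity needed for a given significance follows from a single reference point. The sketch below applies this scaling to the Table 3 numbers as a rough illustration. It is not the procedure used for Fig. 4, which relies on the fitted background function and the cross section at each mass, and the function name is hypothetical.

```python
def luminosity_for_significance(s_ref, b_ref, lumi_ref, target=5.0):
    """Integrated luminosity (same units as lumi_ref) at which
    S / sqrt(S + B) reaches `target`, assuming S and B scale linearly."""
    s_rate = s_ref / lumi_ref
    b_rate = b_ref / lumi_ref
    return target ** 2 * (s_rate + b_rate) / s_rate ** 2

# Reference points from Table 3: (signal, background, luminosity in fb^-1).
REFERENCE = {500: (192, 244, 1.0), 750: (134, 226, 10.0)}

for m, (s, b, lumi) in REFERENCE.items():
    needed = luminosity_for_significance(s, b, lumi)
    n_signal = s / lumi * needed
    print(f"m = {m} GeV: {needed:.2f} fb^-1 for 5 sigma, {n_signal:.0f} signal events")
```

With these inputs the estimate is roughly 0.3 fb$^{-1}$ at 500 GeV and 5 fb$^{-1}$ at 750 GeV, with more than 20 signal events in both cases.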

This study has shown that, if the fourth-family quarks mix primarily with the first two generations,
a clear signal will be observed for the mass range of interest within the first year of low-luminosity
running at the LHC. On the other hand, if the mixing matrix is such that the third SM family quarks play the
dominant role, similar results can be claimed for the $u_4$ quark, while the discovery of the $d_4$ quark is likely
to require more luminosity because of the complexity of the event signature arising from the top-quark
decays. In either case, the first few years of LHC data will resolve the discussion on the possibility of
four SM families within the context of flavor democracy.

Figure 4: On the left, the $q_4\bar{q}_4$ pair production cross section at tree level and, on the right, the integrated
luminosity needed for a $5\sigma$ discovery of the signal, both as a function of the new quark mass. Only pair
production and mixing with the first two families are considered.